This paper addresses the problem of identifying likely topics of texts by their position in the 
text. It describes the automated training and evaluation of an Optimal Position Policy, a 
method of locating the likely positions of topic-bearing sentences based on genre-specific 
regularities of discourse structure. This method can be used in applications such as 
information retrieval, routing, and text summarization. 

Bigrams and trigrams are commonly used in statistical natural language processing; this 
paper will describe techniques for working with much longer n-grams. Suffix arrays (Manber 
and Myers 1990) were first introduced to compute the frequency and location of a substring 
(n-gram) in a sequence (corpus) of length N. To compute frequencies over all N(N + 1)/2 
substrings in a corpus, the substrings are grouped into a manageable number of 
equivalence classes. In this way, a ... 
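
The basic suffix-array lookup mentioned above (frequency and location of one n-gram) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `build_suffix_array` and `ngram_occurrences` are hypothetical, the array is built by naive sorting rather than the Manber and Myers construction, and the grouping of substrings into equivalence classes is not shown.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(tokens):
    """Return the start positions of all suffixes, sorted lexicographically.

    Naive construction (sorts full suffixes); fine for a small illustration,
    far too slow for a real corpus.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def ngram_occurrences(tokens, suffix_array, ngram):
    """Return the sorted start positions at which `ngram` occurs in `tokens`."""
    ngram = list(ngram)
    n = len(ngram)
    # Suffixes truncated to n tokens stay in sorted order, so every suffix
    # that begins with the n-gram sits in one contiguous block.
    prefixes = [tokens[i:i + n] for i in suffix_array]
    lo = bisect_left(prefixes, ngram)
    hi = bisect_right(prefixes, ngram)
    return sorted(suffix_array[lo:hi])


# Hypothetical toy corpus of N = 6 tokens.
corpus = "to be or not to be".split()
sa = build_suffix_array(corpus)
positions = ngram_occurrences(corpus, sa, ["to", "be"])
frequency = len(positions)

# Number of substrings (counted by position) in a corpus of length N.
total_substrings = len(corpus) * (len(corpus) + 1) // 2
```

Here the frequency of an n-gram is the width of its block in the suffix array, and the entries of that block are its locations. Materializing `prefixes` is a simplification for readability; a practical version would compare against the corpus in place during the binary search.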

Traditional information retrieval has relied on the extensive use of statistical parameters in the 
implementation of retrieval strategies. This paper sets out to investigate whether linguistic 
processes can be used as part of a document retrieval strategy. This is done by predefining 
a level of syntactic analysis of user queries only, to be used as part of the retrieval process. 
A large series of experiments on an experimental test collection are reported which use a 
parser for noun phrases as part o ... 